PATENT 
51270-277029 

IN THE CLAIMS:

Please amend claims 1, 7, 11, and 15-17, and add new claims 18-25, as follows: 

1. (Currently amended) A singing voice synthesizing apparatus comprising:

a phoneme database that stores a plurality of voice fragment data formed of voice 
fragments each being a single phoneme or a phoneme chain of at least two concatenated 
phonemes, each of the plurality of voice fragment data comprising data of a deterministic 
component and data of a stochastic component; 

an input device that inputs lyrics; 

a readout device that reads out from said phoneme database the voice fragment data 
corresponding to the inputted lyrics; 

a duration time adjusting device that adjusts time duration of the read-out voice fragment 
data so as to match a desired tempo and manner of singing; 

an adjusting device that adjusts the deterministic component and the stochastic 
component of the read-out voice fragment so as to match a desired pitch, said adjusting device
being configured to adjust the stochastic component by varying a low frequency region of an
amplitude spectrum of the stochastic component according to the desired pitch; and

a synthesizing device that synthesizes a singing sound by sequentially concatenating the 
voice fragment data that have been adjusted by said duration time adjusting device and said 
adjusting device.

2. (Original) A singing voice synthesizing apparatus according to claim 1, wherein said
phoneme database stores a plurality of voice fragment data having different musical 
expressions for a single phoneme or phoneme chain. 

3. (Original) A singing voice synthesizing apparatus according to claim 2, wherein said
musical expressions include at least one parameter selected from the group consisting of 
pitch, dynamics and tempo. 

4. (Original) A singing voice synthesizing apparatus according to claim 1, wherein said
phoneme database stores voice fragment data comprising elongated sounds that are each
enunciated by elongating a single phoneme, voice fragment data comprising
consonant-to-vowel phoneme chains and vowel-to-consonant phoneme chains, voice fragment data
comprising consonant-to-consonant phoneme chains, and voice fragment data comprising 
vowel-to-vowel phoneme chains. 

5. (Original) A singing voice synthesizing apparatus according to claim 1, wherein each
of said voice fragment data comprises a plurality of data corresponding respectively to a 
plurality of frames of a frame string formed by segmenting a corresponding one of the 
voice fragments, and wherein the data of the deterministic component and the data of the 
stochastic component of each of said voice fragment data each comprise a series of 
frequency domain data corresponding respectively to the plurality of frames of the frame 
string corresponding to each of the voice fragments. 

6. (Original) A singing voice synthesizing apparatus according to claim 5, wherein said
duration time adjusting device generates a frame string of a desired time length by
repeating at least one frame of the plurality of frames of the frame string corresponding to
each of the voice fragments, or by thinning out a predetermined number of frames of the 
plurality of frames of the frame string corresponding to each of the voice fragments. 
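
By way of illustration only, and forming no part of the claims, the frame repetition and thinning recited in the claim above might be sketched as follows. The function name `resize_frame_string` and the even index mapping are assumptions made for this sketch; the claim does not prescribe which frames are repeated or thinned out.

```python
def resize_frame_string(frames, target_len):
    """Return a frame string of target_len frames.

    Hypothetical helper: frames are repeated when target_len exceeds
    len(frames) and thinned out when it is smaller.
    """
    n = len(frames)
    if n == 0 or target_len <= 0:
        return []
    # Map each output position to a source frame by integer scaling.
    # Stretching makes some source indices repeat; shrinking skips some.
    return [frames[min(n - 1, i * n // target_len)] for i in range(target_len)]


stretched = resize_frame_string(["f0", "f1", "f2"], 6)
thinned = resize_frame_string([0, 1, 2, 3, 4, 5], 3)
```

Here `stretched` repeats each of the three frames twice, and `thinned` keeps every second frame of the six.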

7. (Currently amended) A singing voice synthesizing apparatus comprising:

a phoneme database that stores a plurality of voice fragment data formed of voice 
fragments each being a single phoneme or a phoneme chain of at least two concatenated 
phonemes, each of the plurality of voice fragment data comprising data of a deterministic 
component and data of a stochastic component; 

an input device that inputs lyrics; 

a readout device that reads out from said phoneme database the voice fragment data 
corresponding to the inputted lyrics; 

a duration time adjusting device that adjusts time duration of the read-out voice fragment 
data so as to match a desired tempo and manner of singing; 

an adjusting device that adjusts the deterministic component and the stochastic 
component of the read-out voice fragment so as to match a desired pitch; and 

a synthesizing device that synthesizes a singing sound by sequentially concatenating the 
voice fragment data that have been adjusted by said duration time adjusting device and said 
adjusting device,

wherein:

each of said voice fragment data comprises a plurality of data corresponding respectively 
to a plurality of frames of a frame string formed by segmenting a corresponding one of the voice 
fragments; 

the data of the deterministic component and the data of the stochastic component of each 
of said voice fragment data each comprise a series of frequency domain data corresponding 
respectively to the plurality of frames of the frame string corresponding to each of the voice 
fragments; and 

said duration time adjusting device generates a frame string of a desired time
length by repeating a plurality of frames of the frame string corresponding to each of the voice 
fragments, said duration time adjusting device repeating the plurality of frames in a first 
direction in which the frame string of a desired time length is generated and in a second direction 
opposite thereto. 

8. (Original) A singing voice synthesizing apparatus according to claim 7, wherein 
when repeating the plurality of frames of the frame string corresponding to the data of the 
stochastic component of each of the voice fragments in the first and second directions, 
said duration time adjusting device reverses a phase of a phase spectrum of the stochastic 
component. 
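
By way of illustration only, and forming no part of the claims, repetition of frames in a first direction and in the opposite direction, with phase reversal for the stochastic component, might be sketched as follows. The name `pingpong_stretch`, the representation of a frame as an `(amplitudes, phases)` pair, and the reading of "reverses a phase" as negating each phase value are all assumptions made for this sketch.

```python
def pingpong_stretch(frames, target_len):
    """Stretch a frame string by reading it forward, then backward, and so on.

    frames: list of (amplitudes, phases) pairs (hypothetical layout).
    Frames read during a backward pass get their phases negated, which is
    one possible interpretation of the phase reversal described above.
    """
    n = len(frames)
    if n == 0 or target_len <= 0:
        return []
    period = max(1, 2 * (n - 1))  # length of one forward-and-back cycle
    out = []
    for i in range(target_len):
        k = i % period
        backward = k >= n          # positions past the last frame run in reverse
        idx = period - k if backward else k
        amps, phases = frames[idx]
        if backward:
            phases = [-p for p in phases]
        out.append((list(amps), list(phases)))
    return out


src = [([1.0], [0.1]), ([2.0], [0.2]), ([3.0], [0.3])]
long_string = pingpong_stretch(src, 6)
```

For the three source frames above, the six output frames are read in the order 0, 1, 2, 1, 0, 1, and only the fourth output frame lies on a backward pass.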

9. (Original) A singing voice synthesizing apparatus according to claim 1, further 
comprising a fragment level adjusting device that performs smoothing processing or level
adjusting processing on the deterministic component and the stochastic component
contained in each of the voice fragment data when the voice fragment data are 
sequentially concatenated by said synthesizing device. 

10. (Original) A singing voice synthesizing apparatus according to claim 5, further 
comprising a deterministic component generating device that changes only pitch of the 
deterministic component to a desired pitch while preserving the spectral envelope shape 
of the deterministic component contained in each of the voice fragment data when the 
voice fragment data are sequentially concatenated by said synthesizing device. 

11. (Currently amended) A singing voice synthesizing apparatus comprising:

a phoneme database that stores a plurality of voice fragment data formed of voice 
fragments each being a single phoneme or a phoneme chain of at least two concatenated 
phonemes, each of the plurality of voice fragment data comprising data of a deterministic 
component and data of a stochastic component; 

an input device that inputs lyrics; 

a readout device that reads out from said phoneme database the voice fragment data 
corresponding to the inputted lyrics; 

a duration time adjusting device that adjusts time duration of the read-out voice fragment 
data so as to match a desired tempo and manner of singing;

an adjusting device that adjusts the deterministic component and the stochastic 
component of the read-out voice fragment so as to match a desired pitch; and 

a synthesizing device that synthesizes a singing sound by sequentially concatenating the 
voice fragment data that have been adjusted by said duration time adjusting device and said 
adjusting device, 

wherein:

each of said voice fragment data comprises a plurality of data corresponding respectively 
to a plurality of frames of a frame string formed by segmenting a corresponding one of the voice 
fragments; 

the data of the deterministic component and the data of the stochastic component of each 
of said voice fragment data each comprise a series of frequency domain data corresponding 
respectively to the plurality of frames of the frame string corresponding to each of the voice 
fragments; and 

said phoneme database stores voice fragment data comprising elongated sounds that are 
each enunciated by elongating a single phoneme, said phoneme database further storing a flat 
spectrum as an amplitude spectrum of the stochastic component of each of the voice fragment
data comprising each of the elongated sounds, obtained by multiplying the amplitude spectrum
thereof by an inverse of a typical spectrum within an interval of the elongated sound. 
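
By way of illustration only, and forming no part of the claims, obtaining a flat spectrum by multiplying an amplitude spectrum by the inverse of a typical spectrum might be sketched as follows. Taking the typical spectrum as the per-bin mean over the frames of the elongated sound is an assumption of this sketch, as are the name `flatten_stochastic_spectra` and the `eps` guard against division by zero.

```python
def flatten_stochastic_spectra(frames, eps=1e-12):
    """Whiten per-frame amplitude spectra of an elongated sound.

    frames: list of amplitude spectra (lists of equal length).
    The typical spectrum is assumed here to be the per-bin mean over the
    interval; each frame is then multiplied by its inverse.
    """
    if not frames:
        return []
    n_bins = len(frames[0])
    typical = [sum(f[k] for f in frames) / len(frames) for k in range(n_bins)]
    return [[f[k] / max(typical[k], eps) for k in range(n_bins)] for f in frames]


flat = flatten_stochastic_spectra([[2.0, 4.0], [4.0, 8.0]])
```

Under this assumption every bin of the flattened spectra averages to one over the interval, so only the frame-to-frame fluctuation remains.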

12. (Original) A singing voice synthesizing apparatus according to claim 11, wherein the 
amplitude spectrum of the stochastic component of each of the voice fragment data
comprising each of the elongated sounds is obtained by multiplying an amplitude
spectrum of the stochastic component calculated based on an amplitude spectrum of the 
deterministic component of the voice fragment data of the elongated sound, by the flat 
spectrum. 

13. (Original) A singing voice synthesizing apparatus according to claim 12, wherein 
said phoneme database does not store amplitude spectra of stochastic components of 
voice fragment data comprising certain elongated sounds, and the flat spectrum stored as 
an amplitude spectrum of voice fragment data comprising at least one other elongated 
sound is used for synthesis of the certain sounds. 

14. (Original) A singing voice synthesizing apparatus according to claim 12, wherein the 
amplitude spectrum of the stochastic component calculated based on the amplitude 
spectrum of the deterministic component has a gain thereof at 0Hz controlled according 
to a parameter for controlling a degree of huskiness. 

15. (Currently amended) A singing voice synthesizing method comprising the steps of: 

storing in a phoneme database a plurality of voice fragment data formed of voice 
fragments each being a single phoneme or a phoneme chain of at least two concatenated 
phonemes, each of said plurality of voice fragment data comprising data of a deterministic 
component and data of a stochastic component; 

reading out from said phoneme database the voice fragment data corresponding to lyrics 
inputted by an input device;

adjusting time duration of the read-out voice fragment data so as to match a desired
tempo and manner of singing; 

adjusting the deterministic component and the stochastic component of the read-out voice 
fragment so as to match a desired pitch, said stochastic component being adjusted by varying a
low frequency region of an amplitude spectrum of the stochastic component according to the
desired pitch; and

synthesizing a singing sound by sequentially concatenating the voice fragment data that 
have been adjusted in respect of the time duration and the deterministic component and the 
stochastic component thereof. 

16. (Currently amended) A program for causing a computer to execute a singing voice 
synthesizing method comprising the steps of: 

storing in a phoneme database a plurality of voice fragment data formed of voice 
fragments each being a single phoneme or a phoneme chain of at least two concatenated 
phonemes, each of said plurality of voice fragment data comprising data of a deterministic 
component and data of a stochastic component; 

reading out from said phoneme database the voice fragment data corresponding to lyrics 
inputted by an input device; 

adjusting time duration of the read-out voice fragment data so as to match a desired 
tempo and manner of singing;

adjusting the deterministic component and the stochastic component of the read-out voice
fragment so as to match a desired pitch, said stochastic component being adjusted by varying a
low frequency region of an amplitude spectrum of the stochastic component according to the
desired pitch; and

synthesizing a singing sound by sequentially concatenating the voice fragment data that 
have been adjusted in respect of the time duration and the deterministic component and the 
stochastic component thereof.

17. (Currently amended) A mechanically readable storage medium storing instructions for 
causing a machine to execute a singing voice synthesizing method comprising the steps of: 

storing in a phoneme database a plurality of voice fragment data formed of voice 
fragments each being a single phoneme or a phoneme chain of at least two concatenated 
phonemes, each of said plurality of voice fragment data comprising data of a deterministic 
component and data of a stochastic component; 

reading out from said phoneme database the voice fragment data corresponding to lyrics 
inputted by an input device; 

adjusting time duration of the read-out voice fragment data so as to match a desired 
tempo and manner of singing; 

adjusting the deterministic component and the stochastic component of the read-out voice 
fragment so as to match a desired pitch, said stochastic component being adjusted by varying a
low frequency region of an amplitude spectrum of the stochastic component according to the
desired pitch; and

synthesizing a singing sound by sequentially concatenating the voice fragment data that 
have been adjusted in respect of the time duration and the deterministic component and the 
stochastic component thereof. 

18. (New) A singing voice synthesizing apparatus according to claim 1, wherein said 
adjusting device adjusts the stochastic component by using an original amplitude 
spectrum for a high frequency region of the amplitude spectrum of the stochastic 
component. 

19. (New) A singing voice synthesizing apparatus according to claim 1, wherein said 
adjusting device varies the low frequency region of the amplitude spectrum by 
compressing or expanding a frequency axis for the low frequency region of the amplitude 
spectrum of the stochastic component according to the desired pitch, with a general shape 
of the amplitude spectrum preserved. 
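
By way of illustration only, and forming no part of the claims, compressing or expanding the frequency axis of the low frequency region while using the original amplitude spectrum for the high frequency region might be sketched as follows. The name `warp_low_band`, the boundary parameter `cutoff_bin`, the use of a ratio of desired pitch to original pitch, and the linear interpolation are assumptions made for this sketch.

```python
def warp_low_band(amp, cutoff_bin, pitch_ratio):
    """Rescale the frequency axis of the bins below cutoff_bin.

    amp: amplitude spectrum of the stochastic component (list of floats).
    pitch_ratio: desired pitch divided by original pitch (assumed input).
    Bins at or above cutoff_bin are copied unchanged.
    """
    if pitch_ratio <= 0:
        raise ValueError("pitch_ratio must be positive")
    cutoff_bin = max(0, min(cutoff_bin, len(amp)))
    out = list(amp)
    for k in range(cutoff_bin):
        # Source position on the original axis, kept inside the low band.
        pos = min(k / pitch_ratio, cutoff_bin - 1)
        lo = int(pos)
        hi = min(lo + 1, cutoff_bin - 1)
        frac = pos - lo
        # Linear interpolation preserves the general shape of the spectrum.
        out[k] = (1 - frac) * amp[lo] + frac * amp[hi]
    return out


warped = warp_low_band([0, 10, 20, 30, 100, 200], cutoff_bin=4, pitch_ratio=2.0)
```

With a ratio of 2 the low band is expanded so that its shape is spread over twice as many bins, while the two bins above the boundary keep their original values.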

20. (New) A singing voice synthesizing method according to claim 15, wherein, in said step 
of adjusting the deterministic and stochastic components, the stochastic component is 
adjusted by using an original amplitude spectrum for a high frequency region of the 
amplitude spectrum of the stochastic component. 

21. (New) A singing voice synthesizing method according to claim 15, wherein, in said step 
of adjusting the deterministic and stochastic components, the low frequency region of the
amplitude spectrum is varied by compressing or expanding a frequency axis for the low
frequency region of the amplitude spectrum of the stochastic component according to the 
desired pitch, with a general shape of the amplitude spectrum preserved. 

22. (New) A program for causing a computer to execute a singing voice synthesizing method 
according to claim 16, wherein, in said step of adjusting the deterministic and stochastic 
components, the stochastic component is adjusted by using an original amplitude 
spectrum for a high frequency region of the amplitude spectrum of the stochastic 
component. 

23. (New) A program for causing a computer to execute a singing voice synthesizing method 
according to claim 16, wherein, in said step of adjusting the deterministic and stochastic 
components, the low frequency region of the amplitude spectrum is varied by 
compressing or expanding a frequency axis for the low frequency region of the amplitude 
spectrum of the stochastic component according to the desired pitch, with a general shape 
of the amplitude spectrum preserved. 

24. (New) A mechanically readable storage medium storing instructions for causing a 
machine to execute a singing voice synthesizing method according to claim 17, wherein, 
in said step of adjusting the deterministic and stochastic components, the stochastic 
component is adjusted by using an original amplitude spectrum for a high frequency 
region of the amplitude spectrum of the stochastic component. 

25. (New) A mechanically readable storage medium storing instructions for causing a 
machine to execute a singing voice synthesizing method according to claim 17, wherein,
in said step of adjusting the deterministic and stochastic components, the low frequency
region of the amplitude spectrum is varied by compressing or expanding a frequency axis 
for the low frequency region of the amplitude spectrum of the stochastic component 
according to the desired pitch, with a general shape of the amplitude spectrum preserved.